Roaring bitmaps: Implementation of an optimized software library

Authors

  • Daniel Lemire
  • Owen Kaser
  • Nathan Kurz
  • Luca Deri
  • Chris O'Hara
  • François Saint-Jacques
  • Gregory Ssi Yan Kai
Abstract

Compressed bitmap indexes are used in systems such as Git or Oracle to accelerate queries. They represent sets and often support operations such as unions, intersections, differences, and symmetric differences. Several important systems such as Elasticsearch, Apache Spark, Netflix’s Atlas, LinkedIn’s Pivot, Metamarkets’ Druid, Pilosa, Apache Hive, Apache Tez, Microsoft Visual Studio Team Services and Apache Kylin rely on a specific type of compressed bitmap index called Roaring. We present an optimized software library written in C implementing Roaring bitmaps: CRoaring. It benefits from several algorithms designed for the single-instruction-multiple-data (SIMD) instructions available on commodity processors. In particular, we present vectorized algorithms to compute the intersection, union, difference and symmetric difference between arrays. We benchmark the library against a wide range of competitive alternatives, identifying weaknesses and strengths in our software. Our work is available under a liberal open-source license.
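As a point of reference for the array intersection mentioned in the abstract, the following is a minimal scalar sketch in C. It is not CRoaring's vectorized code and does not use the library's API. The function name `intersect_sorted_u16` is hypothetical, and the choice of sorted, duplicate-free 16-bit arrays is an assumption made for illustration. A SIMD version would aim to produce the same output while comparing several elements per instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of CRoaring): scalar merge-style
 * intersection of two sorted arrays of 16-bit values without duplicates.
 * Writes the common values to out, which must have room for at least
 * min(len_a, len_b) elements, and returns how many values were written. */
size_t intersect_sorted_u16(const uint16_t *a, size_t len_a,
                            const uint16_t *b, size_t len_b,
                            uint16_t *out) {
    size_t i = 0, j = 0, n = 0;
    while (i < len_a && j < len_b) {
        if (a[i] < b[j]) {
            i++;                 /* a[i] cannot appear in the rest of b */
        } else if (b[j] < a[i]) {
            j++;                 /* b[j] cannot appear in the rest of a */
        } else {
            out[n++] = a[i];     /* value present in both inputs */
            i++;
            j++;
        }
    }
    return n;
}
```

For example, intersecting {1, 3, 5, 7, 9} with {3, 4, 5, 6, 9, 10} writes 3, 5, 9 to `out` and returns 3. The union, difference and symmetric difference can follow the same two-pointer pattern, changing only which branch emits a value.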

Similar resources

Consistently faster and smaller compressed bitmaps with Roaring

Compressed bitmap indexes are used in databases and search engines. A wide range of bitmap compression techniques has been proposed, almost all relying primarily on run-length encoding (RLE), including BBC and WAH. However, on unsorted data, we can get superior performance with a hybrid compression technique that uses both uncompressed bitmaps and packed arrays inside a two-level tree. An inst...

Better bitmap performance with Roaring bitmaps

Bitmap indexes are commonly used in databases and search engines. By exploiting bit-level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus we might prefer compressed bitmap indexes. Following Oracle’s lead, bitmaps are often compressed using run-length encoding (RLE). Building on prior work, we introduce the Roaring compressed bitmap format: it...

Macromodule technology

Developing parallel, optimal and portable software is a difficult problem. It involves learning libraries and programming techniques (such as optimization and parallelization), using the libraries, and modifying written code. At present there is a set of optimized libraries for a large number of tasks. But each library is unique, it demands studying and is optimized for spe...

A High Performance FPGA-Based Accelerator for BLAS Library Implementation

This paper describes the implementation and the performance analysis of a hardware accelerator for the BLAS library matrix multiplication operation. This accelerator is based on a dual-FPGA board and on an implementation BLAS software library making use of the FPGA-based hardware. In order to evaluate the performance of such a system, we implemented the matrix multiplication operation (BLAS “dg...

An Optimized Matrix Multiplication on ARMv7 Architecture

A sufficiently optimized matrix multiplication on embedded systems can facilitate data processing in high performance mobile measuring equipment since plenty of the kernel mathematical algorithms are based on matrix multiplication. In this paper, we propose a matrix multiplication specially optimized for ARMv7 architecture. The performance-critical differences between ARMv7 and conventional des...

Journal title:
  • Softw., Pract. Exper.

Volume 48, issue not listed

Pages: not listed

Publication date: 2018